Learning and Interpreting Multi-Multi-Instance Learning Networks
We introduce an extension of the multi-instance learning problem where
examples are organized as nested bags of instances (e.g., a document could be
represented as a bag of sentences, which in turn are bags of words). This
framework can be useful in various scenarios, such as text and image
classification, but also supervised learning over graphs. As a further
advantage, multi-multi instance learning enables a particular way of
interpreting predictions and the decision function. Our approach is based on a
special neural network layer, called bag-layer, whose units aggregate bags of
inputs of arbitrary size. We prove theoretically that the associated class of
functions contains all Boolean functions over sets of sets of instances and we
provide empirical evidence that functions of this kind can actually be learned
on semi-synthetic datasets. We finally present experiments on text
classification, on citation graphs, and social graph data, which show that our
model obtains competitive results with respect to accuracy when compared to
other approaches such as convolutional networks on graphs, while at the same
time it supports a general approach to interpret the learnt model, as well as
explain individual predictions.
Comment: JML
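The bag-layer described in this abstract can be illustrated with a minimal sketch. It assumes a shared affine map followed by ReLU on each instance and an element-wise maximum over each bag; the abstract does not specify the activation or the aggregator, so both are assumptions, and the names `bag_layer`, `bag_ids`, `weights`, and `bias` are hypothetical. Stacking two such layers handles the nested case of words grouped into sentences, and sentences grouped into documents.

```python
import numpy as np


def bag_layer(instances, bag_ids, weights, bias):
    """Aggregate a variable number of instances per bag into one vector per bag.

    instances: (n_instances, d_in) array, one row per instance.
    bag_ids:   (n_instances,) integer array, the bag each instance belongs to.
    weights, bias: hypothetical shared parameters, shapes (d_in, d_out) and (d_out,).
    Assumes every bag id in 0..max(bag_ids) owns at least one instance.
    """
    # Shared affine map followed by ReLU, applied to every instance (assumed form).
    hidden = np.maximum(instances @ weights + bias, 0.0)
    n_bags = int(bag_ids.max()) + 1
    pooled = np.full((n_bags, hidden.shape[1]), -np.inf)
    # Element-wise maximum over the instances of each bag (assumed aggregator).
    np.maximum.at(pooled, bag_ids, hidden)
    return pooled


# Nested example: 5 words -> 3 sentences -> 2 documents.
rng = np.random.default_rng(0)
words = rng.normal(size=(5, 4))
word_to_sentence = np.array([0, 0, 1, 2, 2])
sentence_to_document = np.array([0, 0, 1])

w1, b1 = rng.normal(size=(4, 6)), np.zeros(6)
w2, b2 = rng.normal(size=(6, 3)), np.zeros(3)

sentences = bag_layer(words, word_to_sentence, w1, b1)        # one row per sentence
documents = bag_layer(sentences, sentence_to_document, w2, b2)  # one row per document
```

Because bag membership is passed as an index array, bags of different sizes need no padding; the output has one row per bag regardless of how many instances each bag holds.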
Shift Aggregate Extract Networks
We introduce an architecture based on deep hierarchical decompositions to
learn effective representations of large graphs. Our framework extends classic
R-decompositions used in kernel methods, enabling nested "part-of-part"
relations. Unlike recursive neural networks, which unroll a template on input
graphs directly, we unroll a neural network template over the decomposition
hierarchy, allowing us to deal with the high degree variability that typically
characterizes social network graphs. Deep hierarchical decompositions are also
amenable to domain compression, a technique that reduces both space and time
complexity by exploiting symmetries. We show empirically that our approach is
competitive with current state-of-the-art graph classification methods,
particularly when dealing with social network datasets.
Classification of cancer pathology reports: a large-scale comparative study
We report on the application of state-of-the-art deep learning techniques
to the automatic and interpretable assignment of ICD-O3 topography and
morphology codes to free-text cancer reports. We present results on a large
dataset (more than 80 000 labeled and 1 500 000 unlabeled anonymized reports
written in Italian and collected from hospitals in Tuscany over more than a
decade) and with a large number of classes (134 morphological classes and 61
topographical classes). We compare alternative architectures in terms of
prediction accuracy and interpretability and show that our best model achieves
a multiclass accuracy of 90.3% on topography site assignment and 84.8% on
morphology type assignment. We found that in this context hierarchical models
are not better than flat models and that an element-wise maximum aggregator is
slightly better than attentive models on site classification. Moreover, the
maximum aggregator offers a way to interpret the classification process.
Comment: 10 pages, 6 figures, 3 tables, accepted for publication in IEEE
Journal of Biomedical and Health Informatics (J-BHI)
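One way an element-wise maximum aggregator can support interpretation is sketched below. This is an illustrative assumption, not necessarily the procedure used in the paper: for each feature dimension, record which token supplied the maximum, then score each token by the fraction of dimensions it wins. The function names `max_aggregate` and `token_importance` are hypothetical.

```python
import numpy as np


def max_aggregate(token_vectors):
    """Pool a (n_tokens, d) array into a single d-vector by element-wise max.

    Also returns, for each dimension, the index of the token that supplied the max.
    """
    pooled = token_vectors.max(axis=0)
    winners = token_vectors.argmax(axis=0)
    return pooled, winners


def token_importance(token_vectors):
    """Fraction of feature dimensions whose maximum comes from each token."""
    _, winners = max_aggregate(token_vectors)
    counts = np.bincount(winners, minlength=token_vectors.shape[0])
    return counts / token_vectors.shape[1]


# Toy report with two tokens and three feature dimensions.
vectors = np.array([[0.1, 0.9, 0.2],
                    [0.8, 0.3, 0.7]])
pooled, winners = max_aggregate(vectors)
scores = token_importance(vectors)
```

In this toy input the second token supplies the maximum in two of the three dimensions, so it receives the higher score; tokens that never win a dimension contribute nothing to the pooled vector.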